keywords:"web archiving" - Search Results - Digital Repository


	Automatic Webpage Reconstruction
	Serečun, Viliam ; Ryšavý, Ondřej (referee) ; Veselý, Vladimír (advisor)
	Many legal institutions require proof regarding web content. This thesis addresses the problem of web page reconstruction and archiving. The primary goal is to provide an open-source solution that satisfies the requirements of these institutions. The work presents two main products. The first is a framework that serves as a building block for developing web scraping and web archiving applications. The second is a web application prototype that demonstrates how the framework is used. The application outputs a MAFF archive file comprising a reconstructed web page, a screenshot of the page, and a table of meta information. The table lists information about the collected data, server information such as the IP addresses and ports of the device hosting the original web page, and a time stamp.
	Web Page Archiving Tools
	Kvačkaj, Matúš ; Rychlý, Marek (referee) ; Burget, Radek (advisor)
	This bachelor thesis deals with the archiving and reproduction of web pages. The aim was to provide a tool that, given a URL and parameters, creates a WARC archive of the page and also generates a textual description of it suitable for further processing and analysis. The tool also supports the reverse process: replaying a site from a WARC archive and generating a textual description of the page. The tool was designed to be applied to an existing dataset as part of bulk data processing. The Webis-Web-Archive-17 dataset was used, which contains approximately 10,000 WARC archives collected since 2017. To make the tool as portable as possible, it was containerized with Docker.
	Long-term Preservation of Web Content
	Kvasnica, Jaroslav ; Pokorný, Jan (advisor) ; Souček, Martin (referee)
	This work describes the long-term preservation of digital documents, particularly websites. It explains what long-term preservation is, defines the differences between the various approaches, and describes the options for long-term preservation of web content, such as migration and emulation. It also explains the risks and challenges of these strategies and discusses the new problems that the goal of long-term preservation raises. It describes possible solutions as well as the situation in selected major foreign institutions. The main aim of the work is a detailed analysis of the long-term preservation strategy of the National Library of the Czech Republic, which is the only institution engaged in preserving the Czech web. The process of data preparation, metadata creation, and data storage in the long-term repository of the Czech National Library is described thoroughly, with examples and explanations. The conclusion sets out future steps for long-term preservation in the Czech Web Archive.
	Comparative Analysis of WebArchiv of the National Library of the Czech Republic and Foreign Projects
	Kupcová, Pavla ; Římanová, Radka (advisor) ; Bratková, Eva (referee)
	This diploma thesis compares WebArchiv with selected foreign web archives that are responsible for preserving national cultural heritage. The introduction briefly explains the history of web archives and the typology of harvesting. The following parts deal with the history, legal aspects of archiving, selected types of harvesting, web resources, systems, access, and evaluation of the Czech (WebArchiv), Australian (Pandora), and British (United Kingdom Web Archive) archives. The text continues with an evaluation of the selected archives that identifies their strengths and weaknesses and proposes possible solutions. The conclusion outlines the problematic aspects of archiving that must be addressed in the future. [Author's abstract]
